Show me the money. Time to wrangle Barchart futures data. Let’s see if I can make practice out of theory.
I got a free trial of Barchart and have downloaded some data. Reminders on the month codes: ZCU23 is September 2023, ZCZ23 is December 2023, ZCH24 is March 2024, and ZCK24 is May 2024.
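For my own reference, the letter-to-month mapping is standard across CME products. A tiny decoder (the function name and helper are mine, not anything from Barchart):

```r
# Standard CME futures month-code letters
month_codes <- c(F = "Jan", G = "Feb", H = "Mar", J = "Apr", K = "May",
                 M = "Jun", N = "Jul", Q = "Aug", U = "Sep", V = "Oct",
                 X = "Nov", Z = "Dec")

# Decode a symbol like "ZCU23": third-from-last char is the month letter,
# last two chars are the two-digit year
decode_contract <- function(sym) {
  letter <- substr(sym, nchar(sym) - 2, nchar(sym) - 2)
  year   <- substr(sym, nchar(sym) - 1, nchar(sym))
  paste(month_codes[[letter]], year)
}

decode_contract("ZCU23")  # "Sep 23"
```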
Check that these loaded cleanly; I hacked off the last row of each file, which carried some "watermarking" from Barchart.
zcu23_daily_historical_data_07_08_2023 <- read_csv("zcu23_daily_historical-data-07-08-2023.csv",
col_types = cols(Time = col_date(format = "%m/%d/%Y")))
zch24_daily_historical_data_07_08_2023 <- read_csv("zch24_daily_historical-data-07-08-2023.csv",
col_types = cols(Time = col_date(format = "%m/%d/%Y")))
ggplot(data = zch24_daily_historical_data_07_08_2023, aes(x = Time, y = High, color = "High")) +
  geom_line() +
  geom_line(aes(y = Open, color = "Open")) +
  geom_line(aes(y = Last, color = "Last")) +
  theme_minimal()
Can compare the two series
ggplot(data = zch24_daily_historical_data_07_08_2023, aes(x = Time, y = High, color = "24")) +
  geom_line() +
  geom_line(data = zcu23_daily_historical_data_07_08_2023, aes(x = Time, y = High, color = "23")) +
  theme_minimal()
Ok, great, as a test case we’re in business. I will download a bunch more stuff. Next to-dos:
Before coffee, look at prices!
zcz23_price_history_07_09_2023 <- read_csv("zcz23_price-history-07-09-2023.csv",
col_types = cols(Time = col_date(format = "%m/%d/%Y")))
Let’s plot price history against futures prices. I tried to find the cash price so I can get some proxy for the basis, but Barchart doesn’t explain exactly what I’m getting here with ZCY00. From FarmProgress I can see the basis at particular elevators. How deep do I want to get into this? I’d love to figure out how to set up some bets based on what’ll happen in the freight markets in the fall/winter, but that may be more granular than I can manage here.
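Whatever ZCY00 turns out to be, the basis arithmetic itself is simple: cash minus futures, matched by date. A sketch with made-up numbers — the toy frames below stand in for the real Barchart downloads, and the column names are mine:

```r
# Toy stand-ins for a cash series and a futures series (invented prices)
cash <- data.frame(Time = as.Date("2023-07-01") + 0:4,
                   Last_cash = c(620, 618, 622, 619, 617))
fut  <- data.frame(Time = as.Date("2023-07-01") + 0:4,
                   Last_fut  = c(610, 609, 615, 611, 608))

# Match on date, then basis = cash - futures (cents per bushel)
basis <- merge(cash, fut, by = "Time")
basis$basis <- basis$Last_cash - basis$Last_fut
basis$basis
```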
ZCY00_Barchart_Interactive_Chart_Daily_07_09_2023 <- read_csv("ZCY00_Barchart_Interactive_Chart_Daily_07_09_2023.csv",
col_types = cols(`Date Time` = col_date(format = "%Y-%m-%d")),
skip = 1)
ggplot(data = zch24_daily_historical_data_07_08_2023, aes(x = Time, y = High, color = "March 24")) +
  geom_line() +
  geom_line(data = zcu23_daily_historical_data_07_08_2023, aes(x = Time, y = High, color = "Sept 23")) +
  geom_line(data = ZCY00_Barchart_Interactive_Chart_Daily_07_09_2023, aes(x = `Date Time`, y = High, color = "cash history")) +
  theme_minimal()
Let’s also compare with the “nearby” series I get from Barchart, where
they stitch together nearest-expiration futures.
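To make sure I understand what "stitched" means, here's a toy version: ride the front contract until its expiry, then jump to the next one. All dates and prices invented:

```r
# Two invented contracts; the second trades at a premium to the first
sep <- data.frame(Time = as.Date("2023-09-01") + 0:20, Last = 500 + 0:20)
dec <- data.frame(Time = as.Date("2023-09-01") + 0:40, Last = 520 + 0:40)
sep_expiry <- as.Date("2023-09-14")

# Stitch: front contract through expiry, next contract afterwards
nearby <- rbind(sep[sep$Time <= sep_expiry, ],
                dec[dec$Time >  sep_expiry, ])
# Note the price jump at the roll -- one reason a stitched series can
# diverge from cash without any "real" move in the market
```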
zcz23_daily_nearby_historical_data_07_12_2023 <- read_csv("zcz23_daily-nearby_historical-data-07-12-2023.csv",
col_types = cols(Time = col_date(format = "%m/%d/%Y")))
ggplot(data = ZCY00_Barchart_Interactive_Chart_Daily_07_09_2023, aes(x = `Date Time`, y = High, color = "cash history")) +
  geom_line() +
  geom_line(data = zcz23_daily_nearby_historical_data_07_12_2023, aes(x = Time, y = High, color = "nearby")) +
  theme_minimal()
Like, is this the nonconvergence I heard about, or is there basis in here, or what? I gotta read up on this. What’s with the seasonal divergence there, too? Let’s put this aside and move on, but I have a lot of questions.
I’m getting the sense that the juice is in the differences. Let’s take a
category theory approach, look at the relationships, haha.
sept_dec_corn_diff <- merge(zcz23_price_history_07_09_2023,
                            zcu23_daily_historical_data_07_08_2023,
                            by = "Time", suffixes = c("_zcz23", "_zcu23"))
ggplot(data = sept_dec_corn_diff, aes(x = Time, y = Open_zcz23)) +
  geom_line() +
  geom_line(aes(y = Open_zcu23)) +
  theme_minimal()
Ok, look at the differences
sept_dec_corn_diff$sept_less_dec_open = sept_dec_corn_diff$Open_zcu23 - sept_dec_corn_diff$Open_zcz23
ggplot(sept_dec_corn_diff, aes(x = Time, y = sept_less_dec_open))+geom_line()+theme_minimal()
Oh interesting. Hm. What happened in mid-May?
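One way to pin the date down rather than eyeballing the plot: find the largest one-day change in the spread. A small helper, shown here on toy data (the real call would pass sept_dec_corn_diff$Time and the spread column):

```r
# Return the date of the biggest one-day move in a series
biggest_move_date <- function(dates, x) {
  jumps <- abs(diff(x))
  dates[which.max(jumps) + 1]
}

# Toy series with an obvious spike on the fourth day
d <- as.Date("2023-05-01") + 0:9
x <- c(10, 11, 10, 25, 24, 25, 26, 25, 24, 25)
biggest_move_date(d, x)
```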
Let’s do a very first-pass STL decomposition on the stitched long-term series, out of curiosity. I’m just building up my intuition here. Ah, all the code I no longer have access to, my boilerplate backtesting…
library(forecast)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
library(TSstudio)
library(lubridate)  # for decimal_date()
# Barchart exports run newest-first, so reverse into chronological order
zc_nearby_ts <- ts(zcz23_daily_nearby_historical_data_07_12_2023[
                     nrow(zcz23_daily_nearby_historical_data_07_12_2023):1, "Open"],
                   start = decimal_date(as.Date("2000-01-04")),
                   frequency = 252)
# This is wrong, this does not treat trading days correctly
ts_decompose(zc_nearby_ts)
That’s fair. Some change in level. But take a look at the remainder component: there’s significant room for improvement, given the cyclicality still visible in it.
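For what it’s worth, the frequency = 252 hack at least lets a decomposition run; here’s the same idea via stats::stl() on synthetic data where I control the truth, as a check on the machinery. Entirely made-up series:

```r
# Synthetic "four years" of 252 trading days: linear trend +
# one annual sine cycle + noise
set.seed(1)
n <- 252 * 4
x <- 0.01 * (1:n) + sin(2 * pi * (1:n) / 252) + rnorm(n, sd = 0.2)

# ts() with frequency = 252 treats observations as evenly spaced --
# the same trading-day caveat as above, but harmless on synthetic data
x_ts <- ts(x, frequency = 252)
fit  <- stl(x_ts, s.window = "periodic")
head(fit$time.series)  # seasonal, trend, remainder columns
```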